The Cost Function

Contents

(A) The Cost Function
(B) The Derived Demand for Factors
(i) Factor Price Effects
(ii) Output Effects
(C) Costs and Returns to Scale
(D) Factor Price Frontiers

(A) The Cost Function

The cost-minimizing choice of inputs depends on two essential sets of parameters: the given output level (Y) and the given factor prices (r and w). It is obvious that if we changed relative factor prices, the cost-minimizing choice of inputs would change. Consider Figure 8.1. At factor price ratio r₁/w₁, the cost-minimizing input choice is K₁, L₁, represented by point e₁, at the tangency of the C₁ isocost curve and the Y* isoquant. Now, suppose that the rental rate of capital fell and the wage level rose, so that the isocost curve at e₁ is now C₂, which has slope -r₂/w₂, where r₂/w₂ < r₁/w₁. Obviously, e₁ no longer represents the cost-minimizing input choice. Instead, the cost-minimizing producer would prefer to change technique and move to a point such as e₂, at the tangency of the isocost curve C₂′ and the isoquant Y*. The new choice of inputs, K₂, L₂, is considerably different: namely, more capital and less labor will be hired at e₂, relative to before.

Figure 8.1 - A Change in Factor Prices

Points e₁ and e₂ in Figure 8.1 both represent cost-minimizing points, albeit at different factor prices. However, one question remains: in the move from e₁ to e₂, have costs risen or fallen? This we cannot tell directly from the picture in Figure 8.1, as the isocost curves C₁ and C₂′ are not obviously comparable. However, in principle, they should be: C₁ is the total cost at e₁ at the old prices r₁/w₁, while C₂′ is the total cost at e₂ at the new prices, r₂/w₂. Both C₁ and C₂′ are numbers, thus we should be able to say whether total costs at e₁ are higher or lower than total costs at e₂ depending on whether C₁ is greater or less than C₂′.

Consequently, we should be able to say whether factor prices r₁/w₁ yield higher or lower total costs than factor prices r₂/w₂ by comparing the relevant costs at their respective cost-minimizing points, in this case, C₁ and C₂′. Thus, we can trace out what can be called a minimum cost function (or simply, a cost function), C = C(r, w, Y*), representing the different minimum costs yielded by different factor price and output configurations. As noted, these costs are evaluated at the relevant cost-minimizing choice of inputs, thus C₁ and C₂′ in Figure 8.1 would be included in the cost function, but C₂ would not.

Of course, the output level Y*, one of the parameters of the cost-minimization story, must be included in the cost function. Its inclusion helps us connect the cost function from the cost-minimization Paretian story with the cost function of the scale-theoretic Marshallian story. However, we shall postpone the Marshallian themes (e.g. long-run versus short-run cost functions), and just outline some of the properties of the cost function we have here.

For generality, we shall rewrite the cost function simply as C = C(w, y), where w represents a vector of factor prices and y represents the given level of output. In this way, the cost function can be written as:

C(w, y) = min_x w·x

s.t. x ∈ V(y)

where V(y) is the input requirement set formed by the isoquant of the desired y.
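The minimization above can be made concrete with a small numerical sketch. We assume, purely for illustration, a two-factor Cobb-Douglas technology Y = K^a L^(1-a); neither this technology nor the function names below come from the text. The sketch walks along the isoquant for Y, keeps the cheapest bundle, and compares the result with the closed-form minimum cost implied by the tangency condition.

```python
# Illustrative sketch only: the Cobb-Douglas technology Y = K**a * L**(1 - a),
# the share parameter `a`, and all function names are assumptions.

def cost_closed_form(r, w, y, a=0.5):
    """Minimum cost C(r, w, Y) implied by the tangency condition."""
    return y * (r / a) ** a * (w / (1 - a)) ** (1 - a)


def cost_grid_search(r, w, y, a=0.5, k_max=20.0, steps=200_000):
    """Walk along the isoquant for Y and keep the cheapest (K, L) pair."""
    best_cost, best_k, best_l = float("inf"), None, None
    for n in range(1, steps + 1):
        k = k_max * n / steps
        # Labor required to stay on the isoquant Y = K**a * L**(1 - a).
        l = (y / k ** a) ** (1.0 / (1 - a))
        cost = r * k + w * l
        if cost < best_cost:
            best_cost, best_k, best_l = cost, k, l
    return best_cost, best_k, best_l


grid_cost, k_star, l_star = cost_grid_search(r=2.0, w=1.0, y=3.0)
print("grid search :", grid_cost, "at K =", k_star, "L =", l_star)
print("closed form :", cost_closed_form(r=2.0, w=1.0, y=3.0))
```

The two numbers should agree up to the resolution of the grid; the grid search is only a stand-in for the constrained minimization, not an efficient way to solve it.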

The cost function and its analysis are due largely to the famous work of Paul Samuelson (1947) and Ronald Shephard (1953) [note: John Hicks (1939) obtained most of these relationships in the context of a consumer expenditure function]. Its general properties are the following:

(1) Non-negativity: C(w, y) > 0 for w > 0 and y > 0

(2) No fixed costs: C(w, 0) = 0

(3) Monotonicity in y: if y′ ≥ y, then C(w, y′) ≥ C(w, y)

(4) Monotonicity in w: if w′ ≥ w, then C(w′, y) ≥ C(w, y)

(5) Homogeneity of degree one in prices: C(λw, y) = λC(w, y)

(6) Concavity: C(w, y) is concave in w.

(7) Continuity: C(w, y) is continuous in w.

(8) Shephard's Lemma: If C(w, y) is differentiable, then there is a unique vector x, such that ∂C(w, y)/∂w_i = x_i.

The explanations of these properties are easily enumerated. Property (1) simply states that if output is positive (y > 0) and factors are not free, then costs will be incurred. Property (2) states that the cost-minimizing choice of inputs to produce nothing will cost nothing. Notice that this is equivalent to saying that there are no fixed costs, i.e. costs incurred before one even begins producing. Property (3) is straightforward enough: if the required level of output rises (i.e. isoquant Y* shifts up to the northeast), then, everything else constant, the total costs incurred at the cost-minimizing point will be higher.

Property (4), which claims that raising any factor price cannot lower costs, can be deduced by simple revealed-choice reasoning. Let w and w′ be two factor price vectors such that w′ ≥ w. Suppose that at factor prices w, x was the cost-minimizing input choice (thus C(w, y) = wx), while at prices w′, x′ was the cost-minimizing input choice (thus C(w′, y) = w′x′). Now, as both x and x′ produce the output y, then at factor prices w, when x was chosen, x′ was also available but it was not chosen. This implies that wx ≤ wx′, otherwise x would not have been the cost-minimizing choice of inputs. But as w′ ≥ w and input quantities are non-negative, we have wx′ ≤ w′x′. Thus, combining inequalities, we see that wx ≤ w′x′, which translates to C(w, y) ≤ C(w′, y). Thus, inequality (4).

Property (5), which establishes the homogeneity of degree one of the cost function (doubling all factor prices doubles total costs), is also straightforward. Suppose, in our canonical example, we multiplied both factor prices r and w by the scalar λ. Then costs change from C = wL + rK to C′ = λwL + λrK. If L and K do not change, then we see immediately that C′ = λC. And it is evident that L and K will not change. Specifically, recall that the slope of the isocost curve is -r/w. Multiplying both prices by the scalar λ, the slope remains unchanged as -λr/λw = -r/w. Thus, the cost-minimizing choice of inputs, L and K, will not change. The only thing that changes is the numerical value of total costs, which rise from wL + rK = C to λwL + λrK = λC. Thus, the homogeneity of the cost function.
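The homogeneity argument can be checked numerically. The sketch below again assumes a Cobb-Douglas technology Y = K^a L^(1-a), which is our own illustrative choice and not part of the text, and uses its closed-form cost-minimizing inputs. Scaling both factor prices by the same scalar should leave the input bundle unchanged and scale total cost by that scalar.

```python
# Illustrative sketch: the Cobb-Douglas technology Y = K**a * L**(1 - a)
# and the function names are assumptions made for this example.

def factor_demands(r, w, y, a=0.5):
    """Cost-minimizing (K, L) from the tangency condition."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def min_cost(r, w, y, a=0.5):
    """Total cost of the cost-minimizing bundle."""
    k, l = factor_demands(r, w, y, a)
    return r * k + w * l


lam = 2.0  # scale both factor prices by the same scalar
base_bundle = factor_demands(3.0, 5.0, 10.0)
scaled_bundle = factor_demands(lam * 3.0, lam * 5.0, 10.0)

# The bundle should be unchanged (up to rounding) and cost should scale by lam.
print("bundles:", base_bundle, scaled_bundle)
print("costs  :", min_cost(lam * 3.0, lam * 5.0, 10.0), lam * min_cost(3.0, 5.0, 10.0))
```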

Property (6), the concavity of the cost function, can be understood via the use of Figure 8.2. We have drawn two cost functions, C*(w, y) and C(w, y), where total costs are mapped with respect to one factor price, w_i. All other factor prices and the output level are being held constant.

Suppose that we have Leontief, no-substitution production technology, so that the cost-minimizing point is always a particular input combination, call it x*, regardless of the factor prices. The corresponding cost function is shown in Figure 8.2 by C*(w, y). Now, because of our Leontief technology, we have a fixed cost-minimizing bundle x* throughout, so as we increase the rental rate of the ith factor, w_i, the total costs of the bundle increase linearly. This becomes obvious when we note that the cost of the bundle x* at any particular set of factor prices w is:

C*(w, y) = wx* = w_ix_i* + Σ_{j≠i} w_jx_j*

where w_ix_i* are the total payments to the ith factor and Σ_{j≠i} w_jx_j* are the payments to the other m-1 factors. Thus, increasing only w_i will increase total costs wx* linearly, thus C*(w, y) is a linear function, as depicted in Figure 8.2.

Figure 8.2 - Cost Function with respect to one factor price

Now, suppose that the rental rate of the ith factor is zero, w_i = 0. Although the ith factor may be free (so w_ix_i* = 0), the other factors are still costly (Σ_{j≠i} w_jx_j* > 0). Thus the costs of producing with bundle x* are positive, i.e. w⁰x* > 0, as shown by point d* in Figure 8.2. As the rental rate of the ith factor increases, the costs of bundle x* increase linearly. As we see in Figure 8.2, when w_i = w_i*, costs are w*x* (point e*) and when w_i = w_i′, costs are w′x* (point f*).

Now, let us increase the degree of substitutability, i.e. let us move away from the Leontief technology and allow there to be different cost-minimizing input bundles at different factor prices. We propose that the new cost function will actually look like the concave function C(w, y) in Figure 8.2. To see why, let us suppose that when we have factor prices w* (and thus w_i = w_i*), then the old bundle x* is still the cost-minimizing bundle. Thus, w*x* is cost-minimizing, thus both the old cost function C*(w, y) and the new one C(w, y) share point e* in Figure 8.2.

However, from point e*, let us suppose the price of the ith factor rises from w_i* to w_i′. In the old cost function, where Leontief technology forced the producer to be stuck with input bundle x*, all that would happen would be that costs would increase from w*x* to w′x*. However, now that there is a degree of substitutability, producers will seek another cost-minimizing bundle x′. It should be obvious that the costs of the new cost-minimizing bundle are not going to be more than the costs incurred if the producer could not choose a new bundle. In other words, w′x′ ≤ w′x*, thus, as shown in Figure 8.2, point f lies below f*. This reflects the very simple idea of substitutability: when given the choice, the producer will choose an input combination x′ with lower costs than x* by substituting away from the factor whose price has risen (in this case, the ith factor). Thus, at least for w_i > w_i*, the cost function C(w, y) will lie below the Leontief cost function C*(w, y). An analogous reasoning applies for w_i < w_i*. Thus, the general concavity of the cost function, C(w, y).
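The comparison between the no-substitution line C*(w, y) and the cost function with substitution can be sketched numerically. As an assumption of ours (not the text's), the substitutable technology is Cobb-Douglas, Y = K^a L^(1-a), the factor whose price varies is labor, and the reference wage W_STAR plays the role of w_i*. The fixed bundle x* is the cost-minimizing bundle at the reference wage.

```python
# Illustrative sketch: the technology, the reference prices and all names
# below are assumptions chosen for this example.

def factor_demands(r, w, y, a=0.5):
    """Cost-minimizing (K, L) for Y = K**a * L**(1 - a)."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def substitution_cost(r, w, y, a=0.5):
    """Cost when the producer re-optimizes the bundle at each wage."""
    k, l = factor_demands(r, w, y, a)
    return r * k + w * l


R, Y = 2.0, 10.0
W_STAR = 4.0                                   # reference wage, the analogue of w_i*
K_STAR, L_STAR = factor_demands(R, W_STAR, Y)  # the fixed bundle x*


def fixed_bundle_cost(w):
    """Cost of the fixed bundle x* at wage w: linear in w, like C*(w, y)."""
    return R * K_STAR + w * L_STAR


for w in (1.0, 2.0, 4.0, 6.0, 8.0):
    print(w, fixed_bundle_cost(w), substitution_cost(R, w, Y))
```

The second column is a straight line in the wage; the third should touch it at the reference wage and lie below it everywhere else.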

Let us turn to a formal demonstration. Let w, w′ be two factor price vectors and let x and x′ be the corresponding cost-minimizing factor bundles. Let us define w⁰ = λw + (1-λ)w′, where λ ∈ (0, 1), thus w⁰ is a convex combination of factor prices w and w′. Consider now another input bundle, x⁰, which is cost-minimizing at prices w⁰. As x⁰ produces the same output y, it was also available at the other factor prices w and w′, but was not chosen at those prices. Thus, by cost-minimization, the following inequalities hold for factor prices w and w′:

wx ≤ wx⁰

w′x′ ≤ w′x⁰

Multiplying the first by λ and the second by (1-λ), and then adding the inequalities, we see immediately that:

λwx + (1-λ)w′x′ ≤ λwx⁰ + (1-λ)w′x⁰

= (λw + (1-λ)w′)x⁰

= w⁰x⁰

by definition. But as wx = C(w, y), w′x′ = C(w′, y) and w⁰x⁰ = C(w⁰, y), then we see immediately that this inequality translates into:

λC(w, y) + (1-λ)C(w′, y) ≤ C(w⁰, y).

Or, more obviously:

λC(w, y) + (1-λ)C(w′, y) ≤ C(λw + (1-λ)w′, y)

thus implying that the cost function C(·, y) is concave in w. Thus, Property 6.
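The concavity inequality can be verified on a concrete case. The following sketch assumes, for illustration only, the closed-form cost function of a Cobb-Douglas technology Y = K^a L^(1-a); the price vectors are arbitrary. It computes the gap between the cost at the convex combination of prices and the convex combination of costs, which should never be negative.

```python
# Illustrative sketch: the Cobb-Douglas cost function and the chosen price
# vectors are assumptions made for this example.

def min_cost(r, w, y, a=0.5):
    """Closed-form minimum cost for Y = K**a * L**(1 - a)."""
    return y * (r / a) ** a * (w / (1 - a)) ** (1 - a)


def concavity_gap(p, q, lam, y=1.0):
    """C(lam*p + (1-lam)*q, y) - [lam*C(p, y) + (1-lam)*C(q, y)]; concavity needs >= 0."""
    mix = tuple(lam * pi + (1 - lam) * qi for pi, qi in zip(p, q))
    return min_cost(*mix, y) - (lam * min_cost(*p, y) + (1 - lam) * min_cost(*q, y))


w_a = (1.0, 4.0)   # (r, w) at the first price vector
w_b = (5.0, 2.0)   # (r, w) at the second price vector
gaps = [concavity_gap(w_a, w_b, n / 10) for n in range(11)]
print(gaps)
```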

Property 7 follows simply from the assumption of the convexity of the isoquants and the linearity of expenditure w·x in factor prices: as factor prices change continuously, there is continuous substitution along the isoquants and, by linearity, costs change continuously.

Property 8 is the famous Shephard's Lemma, ∂C(w, y)/∂w_i = x_i. A simple proof employs the envelope theorem. Total costs are C(w, y) = w·x = Σ_{i=1}^m w_ix_i. Taking the total derivative with respect to w_i, we obtain:

∂C(w, y)/∂w_i = x_i + Σ_{j=1}^m w_j·(∂x_j/∂w_i)

(recall that the other factor prices w_j are given, thus they do not change with respect to w_i). We want to apply the envelope theorem, which will basically make the entire summation term disappear and leave only x_i. Now, recall that x_j is the solution to the cost-minimization problem, thus recall that the first order condition implies that w_j = λƒ_j. Now, recall also that we required that at the solution, y* = ƒ(x). Differentiating this with respect to w_i, we obtain:

0 = Σ_{j=1}^m ƒ_j·(∂x_j/∂w_i)

As the first order condition implies ƒ_j = w_j/λ for all j = 1, 2, .., m, then we can rewrite this as:

0 = (1/λ) Σ_{j=1}^m w_j·(∂x_j/∂w_i)

which, if λ is non-zero and finite, implies that the entire summation term in our earlier equation is zero. Thus, ∂C(w, y)/∂w_i = x_i, which is what was sought. [Note: although named after Shephard (1953), who gave a complete proof using the distance function, we nonetheless see it in John Hicks (1939: p.331) and Paul Samuelson (1947: p.68)]

The reasoning can be restated intuitively this way: suppose that given factor prices w, the bundle x is the cost-minimizing bundle. Increasing the price of the ith factor marginally (say, by one dollar) and allowing for no substitution so that x remains the cost-minimizing bundle (the implication of the envelope theorem), then it is obvious that the total costs of the bundle will increase only by the amount by which spending on the ith factor increases. Now, at the previous factor prices w, we were spending w_ix_i on the ith factor. Consequently, a rise in the ith factor price by a dollar will raise total costs by x_i. More heuristically, if at prices w, w_i = $5 and x_i = 100, then total expenditure on the ith factor was w_ix_i = $500. Increasing w_i from $5 to $6 without changing the bundle (so x_i = 100 still), then we now have total expenditure w_i′x_i = $600. Thus, the change in expenditure on the ith factor when we increased its price by a dollar is $100, i.e. precisely the amount of the x_i factor employed, converted to dollars. Thus, ∂C(w, y)/∂w_i = 100 = x_i.
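Shephard's Lemma can also be checked by finite differences. The sketch below assumes a Cobb-Douglas technology Y = K^a L^(1-a) (an illustrative choice of ours), differentiates its minimum cost numerically with respect to each factor price, and compares the result with the cost-minimizing quantities of capital and labor.

```python
# Illustrative sketch: the technology, step size and function names are
# assumptions made for this example.

def factor_demands(r, w, y, a=0.5):
    """Cost-minimizing (K, L) for Y = K**a * L**(1 - a)."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def min_cost(r, w, y, a=0.5):
    k, l = factor_demands(r, w, y, a)
    return r * k + w * l


def shephard_check(r, w, y, h=1e-5):
    """Central-difference price derivatives of cost, next to the factor demands."""
    dc_dr = (min_cost(r + h, w, y) - min_cost(r - h, w, y)) / (2 * h)
    dc_dw = (min_cost(r, w + h, y) - min_cost(r, w - h, y)) / (2 * h)
    return (dc_dr, dc_dw), factor_demands(r, w, y)


gradient, bundle = shephard_check(r=5.0, w=2.0, y=100.0)
print("dC/dr, dC/dw :", gradient)
print("K, L         :", bundle)
```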

(B) The Derived Demand for Factors

In the cost-minimization and output-maximization exercises, we were able to determine the input combinations chosen by the producer. We noticed that the choice of input combinations depends on two sets of parameters - the factor prices (w) and the desired output level (y). Consequently, we can define the input combinations chosen by the firm in response to factor prices and the desired output levels as the compensated demand for factors. Specifically, these demand functions are merely the arguments that minimize costs, so they can be written succinctly as:

x(w, y) = arg min_x w·x

s.t. x ∈ V(y)

where V(y) is the input requirement set of the desired y. One must remember that x(w, y) is a vector of functions, thus the demand for a particular factor x_i(w, y) is merely one of the entries.

The producer's demand for the ith factor, x_i(w, y), is a function of rental rates and the desired output level. For our canonical case, we would infer from the first order conditions a function K^d = K(r, w, Y*) as the capital demand function and L^d = L(r, w, Y*) as the labor demand function of the producer.

What are the properties of these factor demand functions? In particular, what happens to factor demands when particular input prices or output levels change? The major tool for this is Shephard's Lemma, which stated that ∂C(w, y)/∂w_i = x_i. This resulting x_i is precisely the demand for factor i at factor prices w and output level y. Thus, by Shephard's Lemma, we can analyze the properties of the factor demand functions merely by examining the properties of the first derivative of the cost function. We shall be doing this throughout.

(i) Factor Price Effects

We can delineate the factor price properties of the compensated factor demand functions as follows:

(1) Negative own-price effect: ∂x_i(w, y)/∂w_i ≤ 0

(2) Symmetric cross-price effects: ∂x_i(w, y)/∂w_j = ∂x_j(w, y)/∂w_i

(3) Homogeneity of degree zero in factor prices: x(λw, y) = x(w, y).

Property (1) is the basic proposition that the factor demand curve is downward-sloping, i.e. a rise in the price of a factor will lead to a decline in the demand for it. Diagrammatically, we have already seen in Figure 8.1 that when we reduced the factor price ratio r/w, the cost-minimizing input bundle moved from e₁ (high labor, low capital) to e₂ (low labor, high capital). In other words, we saw that as the wage rose relative to the rental rate on capital, the demand for labor fell while the demand for capital rose. Thus, at least in the simple diagrammatic case of Figure 8.1, an increase in a factor's price will lead to a fall in the demand for that factor by the producer.

To prove this more generally, consider first the impact of a change in the rental rate of the ith factor on the demand for that factor, i.e. ∂x_i(w, y)/∂w_i. Applying Shephard's Lemma we should recognize immediately that as x_i is the partial derivative of the cost function with respect to w_i, then ∂x_i/∂w_i is the second partial derivative of the cost function, i.e.

∂²C(w, y)/∂w_i² = ∂x_i(w, y)/∂w_i.

Now, recall that one of the properties of the cost function was its concavity with respect to individual factor prices. This implies that ∂²C(w, y)/∂w_i² ≤ 0, thus:

∂x_i(w, y)/∂w_i ≤ 0

so that a rise in the ith factor price will reduce the demand for that factor, precisely the result we obtained diagrammatically in Figure 8.1.

Property (2) follows by virtually the same logic. By Shephard's Lemma, ∂x_i(w, y)/∂w_j = ∂²C(w, y)/∂w_i∂w_j. Now, Young's Theorem tells us that ∂²C(w, y)/∂w_i∂w_j = ∂²C(w, y)/∂w_j∂w_i. But we know by Shephard's Lemma that ∂²C(w, y)/∂w_j∂w_i = ∂x_j(w, y)/∂w_i. Thus:

∂x_i(w, y)/∂w_j = ∂x_j(w, y)/∂w_i

i.e. at the margin, the effect of a rise in the price of factor j on the demand for factor i is the same as the effect of a rise in the price of factor i on the demand for factor j. This symmetry of cross-effects is not very economically intuitive, but it follows through.
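Both the negative own-price effect and the symmetry of cross-price effects can be illustrated with finite differences. The sketch assumes a Cobb-Douglas technology Y = K^a L^(1-a) with a = 0.3 (our own choice, so that the two factors are not treated identically) and differentiates the factor demand functions numerically.

```python
# Illustrative sketch: the technology, the share a = 0.3 and the names
# below are assumptions made for this example.

def factor_demands(r, w, y, a=0.3):
    """Cost-minimizing (K, L) for Y = K**a * L**(1 - a)."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def price_effects(r, w, y, h=1e-5):
    """Central-difference estimates of the own- and cross-price effects."""
    k_hi, l_hi = factor_demands(r + h, w, y)
    k_lo, l_lo = factor_demands(r - h, w, y)
    dk_dr, dl_dr = (k_hi - k_lo) / (2 * h), (l_hi - l_lo) / (2 * h)
    k_hi, l_hi = factor_demands(r, w + h, y)
    k_lo, l_lo = factor_demands(r, w - h, y)
    dk_dw, dl_dw = (k_hi - k_lo) / (2 * h), (l_hi - l_lo) / (2 * h)
    return {"dK/dr": dk_dr, "dK/dw": dk_dw, "dL/dr": dl_dr, "dL/dw": dl_dw}


effects = price_effects(r=2.0, w=3.0, y=10.0)
for name, value in effects.items():
    print(name, value)
```

The own-price effects dK/dr and dL/dw should be negative, and the two cross effects dK/dw and dL/dr should coincide up to numerical error.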

However, cross-price elasticities are not symmetric. Specifically, we can define the elasticity of demand for factor i with respect to the price of factor j as:

ε_ij = (∂x_i/∂w_j)·(w_j/x_i)

Our previous result implies that ε_ii ≤ 0, so the own-price elasticity is non-positive. But generally ε_ij ≠ ε_ji. This is evident because ε_ji = (∂x_j/∂w_i)·(w_i/x_j), so ε_ij = ε_ji only if w_j/x_i = w_i/x_j, which we have no reason to assume. Nonetheless, notice that:

∂x_i/∂w_j = ε_ij(x_i/w_j)

∂x_j/∂w_i = ε_ji(x_j/w_i)

Thus by symmetry of cross effects ε_ij(x_i/w_j) = ε_ji(x_j/w_i), which implies that:

ε_ij = ε_ji(w_jx_j/w_ix_i)

Defining s_j = w_jx_j/C(w, y), which as we can note is the proportion of total costs spent on factor j, and s_i = w_ix_i/C(w, y), the proportion of total costs spent on factor i, then immediately we see that:

ε_ij = ε_ji(s_j/s_i)

Thus the cross-price elasticities are proportional to each other, with the proportionality factor being s_j/s_i, the ratio of relative shares of the two factor bills in total costs.
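The proportionality between the two cross-price elasticities can be seen on a concrete case. The sketch assumes a Cobb-Douglas technology Y = K^a L^(1-a) with a = 0.3 (an illustrative assumption; index 0 is capital and index 1 is labor) and estimates the elasticities by finite differences.

```python
# Illustrative sketch: the technology, the prices and all names below are
# assumptions made for this example.

def factor_demands(r, w, y, a=0.3):
    """Cost-minimizing (K, L) for Y = K**a * L**(1 - a)."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def elasticity(demand_index, price_index, r, w, y, h=1e-6):
    """Estimate (dx_i/dw_j) * (w_j / x_i); index 0 = capital, 1 = labor."""
    prices = [r, w]
    up, down = prices[:], prices[:]
    up[price_index] += h
    down[price_index] -= h
    dx = (factor_demands(*up, y)[demand_index]
          - factor_demands(*down, y)[demand_index]) / (2 * h)
    return dx * prices[price_index] / factor_demands(r, w, y)[demand_index]


def cost_shares(r, w, y):
    """Shares of capital and labor in total cost."""
    k, l = factor_demands(r, w, y)
    total = r * k + w * l
    return r * k / total, w * l / total


r, w, y = 2.0, 3.0, 10.0
e_kl = elasticity(0, 1, r, w, y)   # capital demand with respect to the wage
e_lk = elasticity(1, 0, r, w, y)   # labor demand with respect to the rental rate
s_k, s_l = cost_shares(r, w, y)
print("e_KL and e_LK * (s_L / s_K):", e_kl, e_lk * s_l / s_k)
```

The two cross elasticities differ from each other, but the first equals the second multiplied by the ratio of cost shares.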

It is a simple matter to note that ε_ij/s_j is actually the good old Allen elasticity of substitution we derived earlier. Specifically, recall that we defined the Allen elasticity of substitution as:

σ^A_ij = ((Σ_{k=1}^m ƒ_kx_k)/x_ix_j)·|B_ij|/|B|

where |B| is the determinant of the bordered Hessian matrix of the production function y = ƒ(x₁, x₂, .., x_m) and |B_ij| is the cofactor of its ijth element. To see the connection, note that from the first order conditions of the cost-minimization problem, we have w_j = λƒ_j for j = 1, .., m and y = ƒ(x₁, x₂, .., x_m). Thus, totally differentiating all the first order conditions with respect to λ and all the x_i:

dw_j = ƒ_j dλ + λ Σ_{i=1}^m ƒ_ji dx_i   for j = 1, 2, .., m

dy = Σ_{i=1}^m ƒ_i dx_i

Dividing each of the first m equations through by λ, we can set up the result in matrix form as:

| 0     ƒ₁     ƒ₂     ...   ƒ_m  |   | dλ/λ |       | dy     |
| ƒ₁    ƒ₁₁    ƒ₁₂    ...   ƒ_1m |   | dx₁  |       | dw₁/λ  |
| ƒ₂    ƒ₂₁    ƒ₂₂    ...   ƒ_2m |   | dx₂  |   =   | dw₂/λ  |
| :     :      :            :    |   | :    |       | :      |
| ƒ_m   ƒ_m1   ƒ_m2   ...   ƒ_mm |   | dx_m |       | dw_m/λ |

where, note, the matrix on the left is merely the bordered Hessian B of the production function. Now, in order to derive ∂x_i/∂w_j for a particular w_j and x_i on a particular isoquant, we can set dw_k = 0 for all k ≠ j (thus keeping all factor prices but the jth fixed), set dy = 0 (thus staying on the same isoquant) and divide through by dw_j. The system of equations can thus be rewritten as:

| 0     ƒ₁     ƒ₂     ...   ƒ_m  |   | (1/λ)·∂λ/∂w_j |       | 0   |
| ƒ₁    ƒ₁₁    ƒ₁₂    ...   ƒ_1m |   | ∂x₁/∂w_j      |       | 0   |
| ƒ₂    ƒ₂₁    ƒ₂₂    ...   ƒ_2m |   | ∂x₂/∂w_j      |       | 0   |
| :     :      :            :    |   | :             |       | :   |
| ƒ_j   ƒ_j1   ƒ_j2   ...   ƒ_jm |   | ∂x_j/∂w_j     |   =   | 1/λ |
| :     :      :            :    |   | :             |       | :   |
| ƒ_m   ƒ_m1   ƒ_m2   ...   ƒ_mm |   | ∂x_m/∂w_j     |       | 0   |

where, note, we are interpreting the ratios dx_i/dw_j as partial derivatives, ∂x_i/∂w_j. Now, for a particular x_i, in order to obtain ∂x_i/∂w_j, we can apply Cramer's rule:

∂x_i/∂w_j = |B_ij|/(λ|B|)

where |B| is the determinant of the bordered Hessian matrix. To see this, note that Cramer's rule divides by |B| the determinant of B with the column [0, 0, .., 0, 1/λ, 0, .., 0] replacing the column of B associated with x_i. Expanding that determinant along the replaced column, which is all zeroes except for the component associated with w_j (which is 1/λ), we obtain 1/λ times the cofactor of the jith element of B which, as B is symmetric, is equal to |B_ij|, the cofactor of the ijth element.

Now, multiplying ∂x_i/∂w_j by w_j/x_i, we obtain:

ε_ij = (∂x_i/∂w_j)·(w_j/x_i) = (w_j/x_i)·|B_ij|/(λ|B|)

By the first order conditions, total costs are C(w, y) = Σ_{k=1}^m w_kx_k = λ Σ_{k=1}^m ƒ_kx_k, so the cost share of factor j is s_j = w_jx_j/(λ Σ_{k=1}^m ƒ_kx_k). Thus, dividing ε_ij by s_j, the λ terms cancel and we obtain:

ε_ij/s_j = ((Σ_{k=1}^m ƒ_kx_k)/x_ix_j)·|B_ij|/|B|

which is precisely the expression for the Allen elasticity of substitution, σ^A_ij. Finally, turning to the result we obtained earlier that ε_ij = ε_ji(s_j/s_i), this can be restated as ε_ij/s_j = ε_ji/s_i, or σ^A_ij = σ^A_ji, so that the Allen elasticities of substitution are symmetric.
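The bordered-Hessian expression for the Allen elasticity can be evaluated directly for a simple case. The sketch below assumes a two-factor production function ƒ(K, L) = K^a L^b (our own illustrative choice), builds the 3x3 bordered Hessian from its analytical derivatives and applies the formula. For this family of functions the elasticity of substitution is known to equal one, which gives a convenient check.

```python
# Illustrative sketch: the production function f(K, L) = K**a * L**b and all
# names below are assumptions made for this example.

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def allen_elasticity(k, l, a, b):
    """Allen elasticity between K and L via the bordered Hessian formula."""
    y = k ** a * l ** b
    f_k, f_l = a * y / k, b * y / l
    f_kk = a * (a - 1) * y / k ** 2
    f_ll = b * (b - 1) * y / l ** 2
    f_kl = a * b * y / (k * l)
    bordered = [[0.0, f_k, f_l],
                [f_k, f_kk, f_kl],
                [f_l, f_kl, f_ll]]
    # Cofactor of the f_KL entry (row 1, column 2, zero-based): the sign is
    # (-1)**(1 + 2) and the minor deletes that row and that column.
    cofactor_kl = -(bordered[0][0] * bordered[2][1] - bordered[0][1] * bordered[2][0])
    return (f_k * k + f_l * l) / (k * l) * cofactor_kl / det3(bordered)


print(allen_elasticity(2.0, 3.0, 0.3, 0.7))
print(allen_elasticity(5.0, 1.5, 0.4, 0.4))
```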

Property (3) is somewhat clearer. Consider the impact of a change in all input prices by a particular scalar (call it λ). Diagrammatically, we should expect (as we noted before) that the choice of inputs would remain unchanged - largely because a doubling of all prices will leave the slopes of the isocost curves unchanged. We can see this directly by remembering that the cost function is homogeneous of degree one, i.e. C(λw, y) = λC(w, y). But we know from earlier discussion that if a function is homogeneous of degree r, then its partial derivatives are homogeneous of degree r-1. By Shephard's Lemma, x_i(w, y) is a first partial derivative of C(w, y), thus demand will be homogeneous of degree zero, i.e. x_i(λw, y) = x_i(w, y): doubling all factor prices will not affect the demand for any input.

We can exploit this a bit further. By Euler's Theorem, if a function is homogeneous of degree zero, then the sum of the arguments multiplied by their partial derivatives will be zero, i.e.

Σ_{j=1}^m w_j·(∂x_i/∂w_j) = 0

Now, this identity is extremely useful. Note that multiplying the sum by 1 = x_i/x_i this can be rewritten:

x_i·Σ_{j=1}^m (w_j/x_i)·(∂x_i/∂w_j) = 0

which (if x_i is finite and non-zero) implies:

Σ_{j=1}^m ε_ij = 0

so the sum of price elasticities of demand for factor i is equal to zero.

Now, returning to the earlier identity, applying Shephard's Lemma we should recall that ∂x_i(w, y)/∂w_j = ∂²C(w, y)/∂w_i∂w_j. Plugging this back into our Euler's Theorem equation:

Σ_{j=1}^m w_j·(∂²C(w, y)/∂w_i∂w_j) = 0

or, more generally:

∇_ww C(w, y)·w = 0

where ∇_ww C(w, y) is the matrix of second derivatives of the cost function. The elements on the diagonal are the own-price effects on demands, while those off the diagonal are the cross-price effects. By what we have said before, concavity and Young's Theorem imply that ∇_ww C(w, y) will be a symmetric, negative semi-definite matrix.
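The Euler identity can be checked numerically by building the matrix of price derivatives of the factor demands, which by Shephard's Lemma is the matrix of second derivatives of the cost function. The sketch assumes a Cobb-Douglas technology Y = K^a L^(1-a) with a = 0.3; the technology and all names are illustrative assumptions.

```python
# Illustrative sketch: the technology, the prices and the names below are
# assumptions made for this example.

def factor_demands(r, w, y, a=0.3):
    """Cost-minimizing (K, L) for Y = K**a * L**(1 - a)."""
    k = y * (a * w / ((1 - a) * r)) ** (1 - a)
    l = y * ((1 - a) * r / (a * w)) ** a
    return k, l


def demand_jacobian(r, w, y, h=1e-6):
    """2x2 matrix with entry [i][j] = dx_i/dw_j (rows K, L; columns r, w)."""
    prices = [r, w]
    columns = []
    for j in range(2):
        up, down = prices[:], prices[:]
        up[j] += h
        down[j] -= h
        hi = factor_demands(*up, y)
        lo = factor_demands(*down, y)
        columns.append([(hi[i] - lo[i]) / (2 * h) for i in range(2)])
    # columns[j][i] holds dx_i/dw_j, so transpose into row-major form.
    return [[columns[j][i] for j in range(2)] for i in range(2)]


r, w, y = 2.0, 3.0, 10.0
hessian = demand_jacobian(r, w, y)
# Each row multiplied by the price vector should vanish (Euler's Theorem).
euler_sums = [hessian[i][0] * r + hessian[i][1] * w for i in range(2)]
print(hessian)
print(euler_sums)
```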

(ii) Output Effects

Let us now turn to changes in desired output, an issue we have been avoiding because of the close association of this theme to Marshallian theory. Cost functions C(w, y) are functions of output, and thus so are demand functions, x(w, y). The questions we are posing are illustrated in Figure 8.3. At a particular desired output level Y₁, the cost-minimizing bundle is e. Suppose now that desired output increases to Y₂. As factor prices are unchanged, conducting the same cost-minimizing exercise, we obtain the optimal input bundle e′. If output rises again to Y₃, then the cost-minimizing input bundle is e′′.

Figure 8.3 - Output-Expansion Path

As output increases from Y₁ to Y₂ and Y₃, we change the cost-minimizing bundle from e to e′ and e′′. The curve E passing through e, e′ and e′′ in Figure 8.3 is referred to as the output-expansion path and traces the different cost-minimizing bundles as we change the level of output. Notice that the slopes of the isoquants at e, e′ and e′′ are the same (all equal to the factor price ratio, the slope of the isocost curves). Obviously, the isocost curve C₁ associated with the bundle e is below the isocost curve C₂ associated with e′ which, in turn, is below C₃ associated with e′′, i.e. C₁ < C₂ < C₃. Thus, immediately we see that a rise in output will increase the costs associated with the cost-minimizing bundle, or ∂C/∂y ≥ 0.

What can we say about the effect of increasing output on factor demands? Diagrammatically, in Figure 8.3, we saw that the demand for both capital and labor has increased. But this need not always be the case. Recognize that by Shephard's Lemma:

∂x_i(w, y)/∂y = ∂²C(w, y)/∂w_i∂y

But by Young's Theorem, interchanging the terms in the denominator:

∂x_i(w, y)/∂y = ∂²C(w, y)/∂y∂w_i = ∂[∂C(w, y)/∂y]/∂w_i

Now, as we saw, ∂C/∂y ≥ 0 by the properties of the cost function. Now, ∂C/∂y can be interpreted as the marginal cost of output. Thus, whether an increase in output increases or decreases factor demands depends upon whether a rise in the price of factor i increases or decreases the marginal cost of output.

It is not clear what will be the sign of ∂x_i/∂y. We would like it that ∂x_i/∂y ≥ 0 (as is implied in Figure 8.3). In such a case, we refer to the factor as a normal factor. But it can happen that ∂x_i/∂y < 0, in which case we have an inferior factor. This might happen if, say, the factor was indispensable at low scales of production but is substituted against as higher levels of output are achieved. An argument that might justify this would appeal to phenomena such as specialization, indivisibilities, etc.

Recall that we have already touched upon specialization arguments in our discussion of returns to scale of production functions. Specifically, we argued that differing returns to scale are often justified on the basis of changing factor proportions as output changed. However, we also spoke of pure returns to scale, a technical property of the production function summarized by the notion of "doubling all inputs, etc." It must be clear that now we are not talking about technical properties of scale but economic properties of scale. In other words, we are interested in cost-minimizing points as the output level increases. This may very well allow for changing factor proportions.

We see this clearly in Figure 8.3. The labor-capital ratio at e, denoted by the slope of the ray from the origin (L/K)₁, is different from the labor-capital ratio at e′, represented by the different ray from the origin (L/K)₂. Thus, cost-minimization at different output levels can yield different factor proportions or techniques. Indeed, as long as the output expansion path E is curved in any way, there will be changing factor proportions as scale increases.

Happily, the technical aspects of the production function may, in fact, restrict the type of output-expansion paths we see. Specifically, it can be shown that if the production function is homothetic (and all production functions which are homogeneous of whatever degree are homothetic), then there will not be changing factor proportions along the output-expansion path. In other words, the output expansion path E will necessarily be a ray from the origin. Such a situation is depicted in Figure 8.4, where the cost-minimizing points e, e′ and e′′ all lie on the same ray from the origin, E, which also represents the output-expansion path.

Figure 8.4 - Output-Expansion Path for Homothetic Function

Thus, although in principle, output-expansion paths may be curved and twisted, most relevant production functions (which are usually homothetic) will nonetheless exhibit linear output expansion paths. Notice that linearity does not rely upon constant returns to scale. Increasing returns and decreasing returns production functions will also have linear expansion paths. Homotheticity of the production function buys us a few more things in this context: for instance, it guarantees that every factor is normal.
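A linear expansion path under homotheticity can be illustrated as follows. We assume, for the sake of the example, the homogeneous technology Y = (K^a L^(1-a))^k, where the hypothetical parameter `degree` stands for k and covers decreasing (k < 1), constant (k = 1) and increasing (k > 1) returns. The sketch traces the cost-minimizing bundles as output grows and reports the labor-capital ratio at each point.

```python
# Illustrative sketch: the technology Y = (K**a * L**(1 - a))**degree and all
# names below are assumptions made for this example.

def factor_demands(r, w, y, a=0.4, degree=1.0):
    """Cost-minimizing (K, L) for Y = (K**a * L**(1 - a))**degree."""
    scale = y ** (1.0 / degree)
    capital = scale * (a * w / ((1 - a) * r)) ** (1 - a)
    labor = scale * ((1 - a) * r / (a * w)) ** a
    return capital, labor


def expansion_path(r, w, outputs, degree):
    """List of (Y, K, L, L/K) along the output-expansion path."""
    path = []
    for y in outputs:
        capital, labor = factor_demands(r, w, y, degree=degree)
        path.append((y, capital, labor, labor / capital))
    return path


for degree in (0.5, 1.0, 2.0):
    print("degree of homogeneity:", degree)
    for point in expansion_path(2.0, 3.0, (1.0, 2.0, 4.0, 8.0), degree):
        print("  ", point)
```

In each case the ratio L/K should be the same at every output level, and both factor demands should rise with output, so both factors are normal.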

(C) Costs and Returns to Scale

As we have demonstrated, the cost function C(w, y) is positively related to the scale of output. However, as we saw in an earlier section, a production function can exhibit different returns to scale. One ought to imagine that the cost function would thus also capture these different returns to scale in one way or another. This is shown in Figure 8.5 below, where we have plotted the cost function C(w₀, y), where costs are plotted with respect to output y, and factor prices are held fixed at w₀. As we see in Figure 8.5, as output increases, costs increase, but at different speeds.

The easiest way to think of the shape of the cost curve in Figure 8.5 is to recall the typical varying returns-to-scale production function for the one-input, one-output case shown earlier in Figure 3.1. There our production function y = ƒ(x) exhibited first increasing and then decreasing returns to scale as the output level rose. The cost function C(w₀, y) drawn in Figure 8.5 is merely a "stretched mirror image" of the production function in Figure 3.1. In Figure 3.1, y was on the vertical axis and x was on the horizontal. Suppose we flip this around so that y is on the horizontal axis and x on the vertical. The resulting shape would be similar to the cost function in Figure 8.5. However, in Figure 8.5, we do not measure factor inputs on the vertical axis but rather costs. Recall that costs are merely w·x. As factor prices are fixed throughout at w₀, all we need to do is take our inverted production function and "stretch" x by the scalar w₀. Thus, the total cost function C(w₀, y) in Figure 8.5 is merely the production function in Figure 3.1 with axes flipped and the vertical axis increments reindexed from x to w₀x.

Figure 8.5 - Cost Function with respect to output

However, we do not have to restrict ourselves to production technology which is one-output, one-input. Indeed, the cost function of a production function with multiple inputs, y = ƒ(x₁, x₂, x₃, .., x_m), would be effectively the same as that depicted in Figure 8.5 because the cost of a single bundle of factors x at a particular, fixed set of prices w₀ would still be a single number, w₀·x. Thus the cost function corresponding to any multi-factor production function with increasing and then decreasing returns to scale could still be drawn on a plane as in Figure 8.5.

The properties of increasing, constant and decreasing returns to scale correspond, when viewed from the perspective of the cost function, to decreasing, constant and increasing marginal costs to scale. As we see in Figure 8.5, costs increase as output increases throughout; however, notice that the cost function is first concave and then convex. If we define marginal cost of output as MC = ∂C/∂y, the slope of the cost function in Figure 8.5, then we see that marginal costs fall as we raise output from zero to y₂ and then begin to rise as we move from y₂ onwards. The marginal cost curve can thus be drawn independently, as we have done in Figure 8.6.

Average costs can also be deduced. By definition, AC = C/y, thus average costs at any point on the cost curve are captured by the slope of the ray from the origin that passes through it. As we see in Figure 8.5, average costs at y₁ are high, and average costs at y₃ are low, as ray O₁ is steeper than ray O₃. Average costs at y₂ and y₄ are the same as they share the same ray, O₂. Notice that O₃ is the flattest ray we can obtain, thus y₃ represents the output level with the lowest average cost. Thus, as we can deduce from Figure 8.5, average costs decline as output rises from zero to y₃ and then rise again after that. The average cost curve is also drawn independently in Figure 8.6. The average cost and marginal cost curves are due originally to Jacob Viner (1931) and thus the curves in Figure 8.6 are sometimes referred to as Viner curves.

Figure 8.6 - Average Cost and Marginal Cost Curves

Notice that y₂ is the inflection point in the cost function in Figure 8.5, thus y₂ represents the point where we move from decreasing to increasing marginal costs, while y₃ is where we move from decreasing to increasing average costs. As y₂ < y₃, we can define several regions of output: in the region from 0 to y₂, average costs and marginal costs are declining and AC > MC; in the region from y₂ to y₃, we still have AC > MC and average costs declining, but marginal costs are rising; from y₃ onwards, we now have MC > AC and both average and marginal costs increasing.
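The three regions can be checked numerically. The following is a minimal sketch, assuming a hypothetical cubic cost function (not one used in the text) chosen only because it is first concave and then convex; for this particular function the inflection point is at y₂ = 2 and minimum average cost is at y₃ = 3.

```python
# Hypothetical cost function: increasing throughout, concave then convex.
def cost(y):
    return y**3 - 6 * y**2 + 15 * y

def marginal_cost(y):
    # MC = dC/dy, differentiated by hand from cost() above
    return 3 * y**2 - 12 * y + 15

def average_cost(y):
    # AC = C/y, defined only for y > 0
    return cost(y) / y

Y2 = 2.0  # inflection point of cost(): second derivative 6y - 12 = 0
Y3 = 3.0  # minimum of AC: d(AC)/dy = 2y - 6 = 0

def region(y):
    # Classify an output level into the regions described in the text
    if y < Y2:
        return "MC falling, AC falling, AC > MC"
    if y < Y3:
        return "MC rising, AC falling, AC > MC"
    if y == Y3:
        return "minimum AC, MC = AC"
    return "MC rising, AC rising, MC > AC"

for y in (1.0, 2.5, 3.0, 4.0):
    print(y, marginal_cost(y), average_cost(y), region(y))
```

Any other cost function with the same concave-then-convex shape would show the same ordering of MC and AC; only the locations of y₂ and y₃ would differ.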

Notice that the MC and AC curves intersect at y₃, which also happens to be the point of minimum average cost, and that everywhere below y₃, MC < AC, while everywhere above y₃, MC > AC. This is obvious diagrammatically, but can also be proved algebraically. As AC = C(w₀, y)/y, then at output levels below y₃ (i.e. when AC is declining), we must have ∂(AC)/∂y < 0. Differentiating by the quotient rule, this translates to:

∂AC/∂y = [(∂C/∂y)·y - C(w₀, y)]/y² < 0

or, rearranging:

(∂C/∂y)/y < C(w₀, y)/y²

so multiplying through by y:

MC = ∂C/∂y < C(w₀, y)/y = AC

thus MC must lie below AC at output levels below y₃. The analogous exercise can be done for points above y₃ where AC is increasing.
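The analogous case can be written out in the same way. The following is a sketch in LaTeX notation, using the same symbols as the derivation above: where AC is rising its derivative is positive, so the inequality reverses, and at y₃ itself the derivative is zero, so the two curves meet.

```latex
\begin{align*}
y > y_3: \quad & \frac{\partial AC}{\partial y}
  = \frac{(\partial C/\partial y)\, y - C(w_0, y)}{y^2} > 0
  \;\Longrightarrow\;
  MC = \frac{\partial C}{\partial y} > \frac{C(w_0, y)}{y} = AC \\
y = y_3: \quad & \frac{\partial AC}{\partial y} = 0
  \;\Longrightarrow\;
  MC = \frac{\partial C}{\partial y} = \frac{C(w_0, y)}{y} = AC
\end{align*}
```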

Returning to the relationship between returns and costs, it can be easily inferred from the diagram that different returns to scale correspond to different marginal costs (not average costs). Where we have increasing returns to scale, we have decreasing marginal costs (thus, between 0 and y₂); where we have decreasing returns to scale, we have increasing marginal costs (above y₂). Notice the implication: if we have a production function which has decreasing returns to scale throughout, then both our marginal and average cost functions are always rising; if it has increasing returns throughout, then both the marginal and average cost functions are falling throughout. Finally, a constant returns to scale production function necessarily implies flat AC and MC curves.

Several other properties of marginal cost can be detected. Firstly, we have already deduced earlier that marginal cost is always non-negative, i.e. ∂C/∂y ≥ 0. But we also ran into the difficulty that we could not unambiguously detect what happens to marginal cost when a factor price rises, i.e. we could not tell exactly what ∂[∂C(w, y)/∂y]/∂w_i is, because of substitution possibilities. However, what if all factor prices rise proportionally? We can establish the following interesting property: namely, that marginal cost is homogeneous of degree one in prices, i.e.

∂C(λw, y)/∂y = λ ∂C(w, y)/∂y

so that a proportional rise in factor prices will lead to a proportional rise in marginal cost. To obtain this, we need to recall our result that C(w, y) is homogeneous of degree one in prices. This implies, by Euler's theorem, that:

C(w, y) = Σ_{i=1}^m (∂C/∂w_i)·w_i

consequently, differentiating with respect to y:

∂C/∂y = ∂[Σ_{i=1}^m (∂C/∂w_i)·w_i]/∂y

= Σ_{i=1}^m (∂²C/∂w_i∂y)·w_i

or as ∂²C/∂w_i∂y = ∂²C/∂y∂w_i by Young's Theorem, then:

∂C/∂y = Σ_{i=1}^m [∂(∂C/∂y)/∂w_i]·w_i

But notice that this just says that marginal cost (∂C/∂y) can be expressed as the sum of its partial derivatives with respect to factor prices (∂(∂C/∂y)/∂w_i), each multiplied by the corresponding factor price (w_i). Thus, by Euler's Theorem, ∂C/∂y is homogeneous of degree one in factor prices, i.e. doubling all factor prices will double marginal costs.
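The homogeneity property can be verified numerically. The following is a minimal sketch, assuming a Cobb-Douglas technology y = K^a·L^b with illustrative exponents (an assumption, not a form used in the text) and its standard cost function; marginal cost is approximated by a central finite difference.

```python
def cost(r, w, y, a=0.3, b=0.5):
    # Cost function of the assumed Cobb-Douglas technology y = K**a * L**b.
    # It is homogeneous of degree one in the factor prices (r, w).
    s = a + b
    return s * ((r / a) ** a * (w / b) ** b * y) ** (1.0 / s)

def marginal_cost(r, w, y, h=1e-6):
    # Central finite-difference approximation of dC/dy
    return (cost(r, w, y + h) - cost(r, w, y - h)) / (2 * h)

# Illustrative values: factor prices, output level and scaling factor lambda
r, w, y, lam = 2.0, 3.0, 5.0, 2.0

mc = marginal_cost(r, w, y)
mc_scaled = marginal_cost(lam * r, lam * w, y)

# If marginal cost is homogeneous of degree one in prices, this ratio equals lam
print(mc_scaled / mc)
```

Scaling only one of the two prices would not, in general, scale marginal cost proportionally, which is the ambiguity noted earlier for a single factor price change.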

This fact caused a dilemma in early economic theory. Specifically, it was commonly stipulated as an empirical fact (albeit one rooted in armchair speculation) that agriculture exhibited decreasing returns to scale while manufacturing always exhibited increasing returns to scale. This idea can be found in the early work of the Classical economists. Alfred Marshall reiterates it:

"in those industries which are not engaged in raising raw produce [i.e. manufacturing] an increase of labour and capital generally gives a return increased more than in proportion; and further this improved organization tends to diminish or even override any increased resistance which nature may offer to raising increased amounts of raw produce." (A. Marshall, 1890: p.265)

The dilemma arises because a firm whose technology exhibits increasing returns throughout must have decreasing marginal costs throughout.

If the technical returns to scale properties of the production function can be captured by the cost curve C(w₀, y), can we also obtain the rest of the properties of the production function (e.g. convexity of the isoquants, etc.) via the cost function? Indeed, we can. As Hirofumi Uzawa (1964) has shown, one can obtain isoquants of the production function from the revealed cost-minimizing choices. Changing factor prices continuously for a given level of output, the cost-minimizing choices will trace out the corresponding isoquant of that level of output. Thus, even if we do not know the isoquants, we can hypothetically trace them out from the cost-minimizing choices of the producers.
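The tracing exercise can be sketched numerically. The example below is only an illustration under stated assumptions: the technology is taken to be y = sqrt(K·L), whose cost function is C = 2·y·sqrt(r·w), and the cost-minimizing inputs are read off the cost function as its price derivatives (Shephard's Lemma), approximated by finite differences. The function names are hypothetical.

```python
import math

def cost(r, w, y):
    # Cost function of the assumed technology y = sqrt(K * L)
    return 2.0 * y * math.sqrt(r * w)

def factor_demands(r, w, y, h=1e-6):
    # Shephard's Lemma: K = dC/dr and L = dC/dw (central differences)
    K = (cost(r + h, w, y) - cost(r - h, w, y)) / (2 * h)
    L = (cost(r, w + h, y) - cost(r, w - h, y)) / (2 * h)
    return K, L

y0 = 4.0  # the output level whose isoquant we trace

# Vary the rental rate while holding the wage at 1: each price pair
# reveals one cost-minimizing (K, L) point on the isoquant for y0.
isoquant_points = [factor_demands(r, 1.0, y0) for r in (0.25, 0.5, 1.0, 2.0, 4.0)]

for K, L in isoquant_points:
    # The third column recomputes output from the recovered inputs
    print(round(K, 4), round(L, 4), round(math.sqrt(K * L), 4))
```

Every recovered point produces the same output y0, so the points lie on a single isoquant even though the production function itself was never consulted when choosing them.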

However, a caveat is in order: while we can recover convex isoquants by tracing the cost-minimizing points, we cannot recover non-convex isoquants by these means. In other words, if the true production function has a non-convex input requirement set V(y), then by Uzawa's exercise we can only trace out the convex hull of V(y). Why this is so makes sense geometrically for isoquants: the non-convex portions are never chosen by cost-minimizing producers, thus we can never "see" them: "the concave portions of indifference curves [and isoquants], if they exist, must forever remain in unmeasurable obscurity." (Hotelling, 1935).

(D) Factor Price Frontiers

We can continue exploiting the relationship between cost functions and production functions by turning to factor price frontiers. The concept is due to Paul A. Samuelson (1953, 1957), albeit only shown diagrammatically in Samuelson (1962) and D.G. Champernowne (1953). The factor price frontier is a central tool to illustrate the famous Cambridge Capital Controversy (cf. G.C. Harcourt (1972); see also J. Hicks (1965) and H. Kurz and N. Salvadori (1995)). Strictly speaking, in most applications, the factor price frontier is conceived in reference to economy-wide equilibrium. However, our focus here is confined to the producer's cost-minimizing decision, thus our factor price frontier will contain somewhat less information. It remains, however, quite relevant in that, as we shall see, we can conceive of the factor price frontier simply as the upper contour set of the cost function C(w, y).

Dimensional restrictions allow us to derive the factor price frontier only for the two-factor case, thus we shall restrict ourselves to our canonical case, Y = ƒ(K, L), with costs defined by C = rK + wL. The factor price space is the w-r space, shown in Figure 8.7. We want to derive a relationship between returns to labor (w) and returns to capital (r) for a given level of costs. We can express the cost equation C = rK + wL as:

w = C/L - (K/L)r

Now, for a given technique K/L and a given cost level C, we can derive a factor-price curve which gives the different combinations of w and r which, at a given K/L, yield the same cost C.

[Note: when dealing with applications to economy-wide equilibrium, economists normally assume constant returns to scale so that they can write Y = rK + wL by Euler's theorem, and then proceed to trace out factor price curves and frontiers from there; in that case, the factor price frontier can be conceived as the dual of the production function and an economy-wide equilibrium locus. We shall refrain from that manoeuvre here, and concentrate solely on cost; albeit see our discussion of the Cambridge Capital Controversy.]

In Figure 8.7, we have a series of factor-price curves, all for a given capital-labor ratio k = K/L; thus, they all have slope -k = -K/L. Notice that the vertical intercept of the factor-price curve denotes a particular cost level, C/L (the horizontal intercept is C/K). Thus, as C′ < C* < C″, factor price curve C′ represents lower costs than C*, which in turn represents lower costs than C″.

Fig. 8.7 - Factor Price Curves for One Technique

To understand the meaning of a factor price curve, it is useful to imagine that we have a Leontief production technology, and thus constant factor proportions, captured here by k. Suppose w = w₁ and r = r₁, so that we are at e₁ in Figure 8.7. At these factor prices, total costs are C*. Suppose now w decreases to w₂, so that we have factor price combination (w₂, r₁), shown in Figure 8.7 by e₁′. As nothing else changes, costs will fall, thus we move to the lower factor-price curve, C′. In order to return to the same cost level C*, we therefore need to raise r to r₂. Thus, e₁ = (r₁, w₁) and e₂ = (r₂, w₂) represent the same amount of total costs. Similarly, if we start from e₁ and raise r to r₃, then total costs will rise to C″. Lowering w to w₃, we will return to the same cost level C*. Thus, e₁ = (r₁, w₁) and e₃ = (r₃, w₃) represent the same total costs C*. More straightforwardly, note that everywhere on a given factor price curve, we have wL + rK = C*; thus, as C*, K and L are given, if w goes down, r must go up to keep costs at C*.
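The movement along a single factor price curve can be illustrated with numbers. The following is a minimal sketch, assuming hypothetical values for K, L and C* (none of them taken from the figures); it applies the formula w = C/L - (K/L)r from the text.

```python
def wage_on_curve(r, C, K, L):
    # Factor price curve for a fixed technique (K, L): w = C/L - (K/L) * r
    return C / L - (K / L) * r

def total_cost(r, w, K, L):
    # C = rK + wL
    return r * K + w * L

# Hypothetical fixed technique and target cost level
K, L, C_star = 2.0, 4.0, 10.0

# Two rental rates and the wages that keep total cost at C_star
r1, r2 = 1.0, 3.0
w1 = wage_on_curve(r1, C_star, K, L)
w2 = wage_on_curve(r2, C_star, K, L)

print("e1 :", (r1, w1), total_cost(r1, w1, K, L))   # on the C* curve
print("e1':", (r1, w2), total_cost(r1, w2, K, L))   # wage cut alone: lower curve
print("e2 :", (r2, w2), total_cost(r2, w2, K, L))   # r raised: back on the C* curve
```

The middle line corresponds to the intermediate point where only the wage has fallen: costs drop below C*, and raising the rental rate restores them.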

In Figure 8.8 we have drawn two different factor price curves corresponding to two different capital-labor ratios, k₁ = K₁/L₁ and k₂ = K₂/L₂, at the same cost level, which we normalize to C = 1 (ignore the dashed line for the moment). Thus, the factor price curve associated with k₁ has slope -k₁, vertical intercept 1/L₁ and horizontal intercept 1/K₁, whereas the factor price curve associated with k₂ has slope -k₂, vertical intercept 1/L₂ and horizontal intercept 1/K₂. Notice that the factor price curve k₁ is steeper than k₂, thus we know that k₁ > k₂, i.e. the steeper the factor price curve, the more capital-intensive (less labor-intensive) the corresponding k is.

Figure 8.8 - Factor Price Curves for Two Techniques

Reading Figure 8.8 is a little tricky due to the normalization to unit costs. Nonetheless, consider the capital-intensive technique k₁. The formula for this factor price curve is:

w = 1/L₁ - (K₁/L₁)r

Thus, for a given r (or given w), we can find the corresponding wage w (or corresponding r) that yields total costs of 1 at that given technique k₁. Consider wage w₁; the corresponding rental rate on capital is r₁, shown at e₁. Similarly, at wage w₂, the corresponding rental rate on capital that yields unit costs for technique k₁ is r₂, shown by point e₂. In contrast, notice that the factor price curve for the labor-intensive technique k₂ is governed by w = 1/L₂ - (K₂/L₂)r. Following the same logic, when the wage is w₂, the rental rate on capital that yields unit costs for technique k₂ is r₂′, as shown by point f₂. Notice that at w₃, the corresponding rental rate for both techniques is r₃, as shown by point e₃.

The usefulness of the factor price curves is that we can trace the cost-minimizing choice of technique by the "rule of the outermost". Specifically, at any given real wage w, we can detect what the chosen cost-minimizing technique is by choosing the technique that yields the highest r. Thus, in Figure 8.8, when w = w₁, the cost-minimizing choice of technique is k₁. When w = w₃, both k₁ and k₂ are cost-minimizing (we are indifferent between techniques). Finally, when w = w₂, the cost-minimizing choice of technique is k₂ and not k₁.

The "rule of the outermost" may not seem to make much intuitive sense in this last case: at w = w₂, the r corresponding to k₁ is r₂ while the r corresponding to k₂ is r₂′. But if r₂′ > r₂, don't we have greater costs using technique k₂ than technique k₁? No. Total costs, as we noted, are the same on both factor price curves k₁ and k₂ as we have normalized costs to 1. In other words, costs to using technique k₁ at factor prices r₂/w₂ are the same as the costs to using technique k₂ at factor prices r₂′/w₂.

However, it is precisely because total costs are the same on both factor price curves that we can say that k₂ is a cost-minimizing choice of technique when w = w₂. To see why, note that if we decided to use technique k₂ at factor prices r₂/w₂, i.e. if we were forced to stay at point e₂, then we would be off the factor price curve k₂. In other words, we would not be incurring unit costs at e₂ but rather less than unit costs. We can see this via the parallel dashed line passing through e₂ in Figure 8.8, which represents technique k₂ when forced to use prices r₂/w₂. Notice that the costs represented by this dashed curve are lower than unit costs (compare the intercepts, C/L₂ for the dashed line and 1/L₂ for the unit factor price curve for technique k₂; obviously, C < 1). In other words, using technique k₂ at factor prices r₂/w₂ yields lower costs (i.e. C₂) than using technique k₁ at factor prices r₂/w₂ (which yields unit costs). The "rule of the outermost", therefore, merely captures the idea of cost-minimization, while keeping total costs normalized to 1.
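The logic of the "rule of the outermost" can be checked with two made-up techniques. This is a sketch under stated assumptions: the (K, L) pairs and wage levels below are hypothetical, both techniques are assumed to produce the same output, and costs are normalized to 1 as in the text.

```python
# Hypothetical techniques as (K, L) pairs; k1 is the capital-intensive one.
TECHNIQUES = {"k1": (2.0, 1.0), "k2": (1.0, 2.0)}

def rental_on_unit_curve(w, K, L):
    # Solve rK + wL = 1 for r: the unit-cost factor price curve of one technique
    return (1.0 - w * L) / K

def cost_at(r, w, name):
    # Total cost of running the named technique at factor prices (r, w)
    K, L = TECHNIQUES[name]
    return r * K + w * L

def outermost(w):
    # Rule of the outermost: pick the technique whose curve gives the highest r at w
    return max(TECHNIQUES, key=lambda n: rental_on_unit_curve(w, *TECHNIQUES[n]))

def cheapest(r, w):
    # Direct cost minimization over the available techniques at prices (r, w)
    return min(TECHNIQUES, key=lambda n: cost_at(r, w, n))

for w in (0.8, 0.1):  # a high and a low wage
    chosen = outermost(w)
    r = rental_on_unit_curve(w, *TECHNIQUES[chosen])
    print(w, chosen, round(r, 3), cheapest(r, w))

# Low wage: k2 evaluated at the rental rate read off the k1 curve costs less than 1
w_low = 0.1
r_on_k1 = rental_on_unit_curve(w_low, *TECHNIQUES["k1"])
print(cost_at(r_on_k1, w_low, "k2"))
```

In both cases the technique picked by the rule is also the one that direct cost comparison selects, and the last line reproduces the "less than unit costs" observation made for the dashed line.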

The meaning of Figure 8.8 can best be gathered by reading it in conjunction with the more familiar Figure 8.9, where we have depicted the activity analysis unit isoquant when we only have two techniques, k₁ and k₂. If relative factor prices are low at r₁/w₁, we obtain a whole series of isocost curves with slope -r₁/w₁. Notice that minimizing costs in the isoquant space, we would choose the capital-intensive technique k₁, yielding isocost curve C₁. If factor prices increased to r₂/w₂ and we stayed on the same technique k₁, the isocost curve would swivel to C₂ in Figure 8.9. But this is not the cost-minimizing thing to do. Facing prices r₂/w₂, costs would be minimized if we moved to the labor-intensive technique k₂ and corresponding isocost curve C₂′.

Finally, notice that if factor prices are r₃/w₃, the isocost curves are parallel to the isoquant segment between k₁ and k₂. Cost-minimization, in this case, leaves us with an indeterminate input choice: k₁, k₂ or any convex combination of these would all be cost-minimizing at those factor prices. Point e₃ in Figure 8.8 is often called a switchpoint, as it denotes the factor prices at which we move from one technique to another. Notice that it is quite sensitive: a slight rise in w leads to a complete switch to k₁, a slight fall in w leads to a complete switch to k₂.

Fig. 8.9 - Cost-Minimization with Two Activities

The cost-minimization exercise in isoquant-isocost space in Figure 8.9 is precisely captured by the "rule of the outermost" with factor-price curves in Figure 8.8. When wages are w₁, the rule of the outermost tells us the optimal technique is k₁, thus k₁ is chosen. As wages decline to w₂, the rule of the outermost tells us we choose k₂. This is effectively equivalent to a unit-cost-normalized version of the move from r₁/w₁ to r₂/w₂ in the non-normalized case in Figure 8.9. Notice also that when the wage is w₃, the rule of the outermost tells us that we are indifferent between k₁ and k₂: this corresponds to the indeterminacy in r₃/w₃ in Figure 8.9. Thus, although our normalizations make them seem different, choosing cost-minimizing isocost curves in the capital-labor space as in Figure 8.9 yields exactly the same information about cost-minimizing input choices as when we follow the "rule of the outermost" in factor price space. The information in both diagrams is effectively the same.

Adding more techniques in capital-labor space translates into adding more activity rays, to the point where we might get a smooth isoquant. Similarly, in factor price space, adding more techniques would add more factor-price curves. In the limit, as the number of activities increases to infinity, we would be able to trace a factor price frontier as the envelope of the factor price curves. This is shown in Figure 8.10 by the thick line C(w, y₀) = 1. Notice that the factor price frontier follows the "rule of the outermost" for all the factor price curves. [The "factor price frontier" is Samuelson's (1962) term; Hicks (1965: p.150) calls it the "wage frontier"; Neo-Ricardians (e.g. Kurz and Salvadori, 1995: p.50) tend to call it the "wage-profit frontier".]
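The envelope argument can be illustrated numerically. The sketch below is an assumption-laden example, not the construction behind Figure 8.10: techniques are sampled from a hypothetical smooth unit isoquant K·L = 1, and for a fixed wage the "rule of the outermost" is applied to an increasing number of them. For this particular isoquant the limiting frontier works out to r = 1/(4w).

```python
def techniques_on_unit_isoquant(n):
    # n hypothetical techniques (K, L) on the isoquant K * L = 1,
    # with K spaced evenly in logs between 0.1 and 10
    ks = [10 ** (-1 + 2 * i / (n - 1)) for i in range(n)]
    return [(K, 1.0 / K) for K in ks]

def frontier_r(w, techniques):
    # Rule of the outermost: the highest r among all unit-cost
    # factor price curves rK + wL = 1 at the given wage
    return max((1.0 - w * L) / K for K, L in techniques)

w = 0.5
limit = 1.0 / (4.0 * w)  # envelope value for this assumed isoquant

for n in (2, 4, 16, 256, 5000):
    r_n = frontier_r(w, techniques_on_unit_isoquant(n))
    print(n, r_n, limit - r_n)  # the gap to the envelope is never negative
```

With only a few techniques the outermost curve can lie well inside the envelope; as techniques are added the gap shrinks, which is the sense in which the frontier is the limit of the factor price curves.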

Fig. 8.10 - The Factor-Price Frontier

The reason for labelling the factor price frontier as C(w, y₀) = 1 is precisely because it represents the combinations of factor prices that, by cost-minimization, yield unit costs. In fact, it should be noted that the factor price frontier in Figure 8.10 represents a unit contour of the cost function for a given output level y₀. This is intuitive. We previously demonstrated that the cost function C(w, y) was concave with respect to factor prices, thus, in effect, the cost function is a "hill" over factor prices. We know that the upper contour set of a concave function is convex. This is precisely what we have here: the factor price frontier is merely the unit contour line of the "cost hill" and it is, indeed, convex.

The slope of the factor price frontier is obtained by differentiating the cost function at a given output level, C(w, y₀), with respect to factor prices. Using the implicit function rule:

∂w/∂r = -[∂C(w, y₀)/∂r]/[∂C(w, y₀)/∂w]

But, by Shephard's Lemma, we know that ∂C(w, y₀)/∂w = L and ∂C(w, y₀)/∂r = K, thus ∂w/∂r = -K/L. Thus, at any particular factor price combination in Figure 8.10, the corresponding slope of the factor price frontier is the negative of the factor input ratio K/L that minimizes costs at those factor prices. This, of course, is precisely the "rule of the outermost" that is traced by the factor price frontier.
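The slope result can be verified numerically for a specific case. The following is a minimal sketch, assuming the technology y = sqrt(K·L) with cost function C = 2·y·sqrt(r·w) and y₀ = 1 (an illustrative choice, not the function behind Figure 8.10); all derivatives are approximated by central finite differences.

```python
import math

def cost(r, w, y=1.0):
    # Cost function of the assumed technology y = sqrt(K * L)
    return 2.0 * y * math.sqrt(r * w)

def frontier_w(r, y=1.0):
    # Unit-cost frontier: solve cost(r, w, y) = 1 for w
    return 1.0 / (4.0 * y**2 * r)

def frontier_slope(r, h=1e-6):
    # dw/dr along the frontier
    return (frontier_w(r + h) - frontier_w(r - h)) / (2 * h)

def capital_labor_ratio(r, w, h=1e-6):
    # Shephard's Lemma: K = dC/dr, L = dC/dw, so K/L is a ratio of price derivatives
    K = (cost(r + h, w) - cost(r - h, w)) / (2 * h)
    L = (cost(r, w + h) - cost(r, w - h)) / (2 * h)
    return K / L

r0 = 0.4
w0 = frontier_w(r0)  # the wage that puts (r0, w0) on the unit-cost frontier

# The two printed numbers should agree up to finite-difference error
print(frontier_slope(r0), -capital_labor_ratio(r0, w0))
```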

Notice two other interesting results. The first (obvious) one, as we have already seen, is that if w falls and r rises (i.e. r/w rises), then the cost-minimizing choice of inputs will be more labor-intensive, e.g. we move from capital-intensive k₁ to labor-intensive k₂ in Figure 8.10. This is the standard result of the derived demand for factors we obtained earlier: the demand for a factor falls when its price rises. Notice a second implication. A ray from the origin in factor price space will have slope w/r, while the wage-profit frontier will have slope -K/L. Consequently, we can measure the curvature of the factor price frontier by the formula:

η = ∂ ln(w/r)/∂ ln(K/L)

which one will recognize immediately, as w/r = ƒ_L/ƒ_K by cost-minimization, to be the inverse of the elasticity of substitution, σ, i.e. η = 1/σ. Thus, when the elasticity of substitution is very low, e.g. σ = 0 as in the Leontief case, then η = ∞, i.e. the factor-price frontier is completely linear. But we have already seen this: as noted earlier, when there is a single technique, as in Leontief, the factor price frontier collapses to a single factor price curve which is, of course, linear. The converse applies: factor price frontiers take on an L-shape when σ = ∞, i.e. perfect substitution among factors. Thus, the more curved isoquants are, the less curved the factor price frontier is.
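As a worked illustration, assume a CES production function; this functional form and its parameters a and ρ are assumptions introduced here, not part of the discussion above. Under that assumption the relation η = 1/σ can be read off directly from the cost-minimization condition:

```latex
\begin{align*}
Y &= \left[ a K^{\rho} + (1-a) L^{\rho} \right]^{1/\rho},
   \qquad 0 < a < 1,\ \rho < 1,\ \rho \neq 0 \\
\frac{w}{r} &= \frac{f_L}{f_K}
   = \frac{1-a}{a} \left( \frac{K}{L} \right)^{1-\rho} \\
\ln \frac{w}{r} &= \ln \frac{1-a}{a} + (1-\rho) \ln \frac{K}{L} \\
\eta &= \frac{\partial \ln (w/r)}{\partial \ln (K/L)} = 1 - \rho = \frac{1}{\sigma},
   \qquad \sigma = \frac{1}{1-\rho}
\end{align*}
```

As ρ tends to minus infinity, σ tends to zero and η grows without bound, which is the Leontief case; as ρ tends to one, σ tends to infinity and η tends to zero, which is the case of perfect substitution.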
